Skip to content

DOC: Added examples to pandas.concat documentation #15028

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Jan 4, 2017
Merged

DOC: Added examples to pandas.concat documentation #15028

merged 16 commits into from
Jan 4, 2017

Conversation

palewire
Copy link
Contributor

@palewire palewire commented Jan 1, 2017

I've added two simple examples to the pd.concat documentation. I've also done a very small amount of clean up on the rest of the docstring.

  • passes git diff upstream/master | flake8 --diff

Returns
-------
concatenated : type of objects

Notes
-----
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you are going to add examples, then this needs to be more complete, using examples from here:http://pandas.pydata.org/pandas-docs/stable/merging.html#concatenating-objects

certainly not all of these, but a quick selection (as well as having this link would be fine).

@jreback jreback added the Docs label Jan 1, 2017
@palewire
Copy link
Contributor Author

palewire commented Jan 1, 2017 via email

Combine two ``Series``.

>>> import pandas as pd
>>> s1 = pd.Series(['a', 'b'])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no indentation is needed here for the example blocks

(and the 'import pandas as pd' can also be left out -> this is assumed)

@@ -705,7 +705,7 @@ def _get_join_info(self):
_left_join_on_index(left_ax, right_ax, self.left_join_keys,
sort=self.sort)

elif self.left_index and self.how == 'right':
elif self.tleft_index and self.how == 'right':
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls don't change code in the same PR as docs (unless its germane), I assume this was a typo though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was. I'm sorry for the oversight.

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice examples! Added some more comments

3 d
dtype: object

Add a ``hierarchical index`` at the outermost level of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to add backtick quotes around 'hierarchical index'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've fixed this.

1 d
2 e
dtype: object
>>> c.ix['s1']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above example is good. But I would leave out the examples on indexing the result to keep the examples focused on concat (such examples would certainly be nice for the tutorial docs)

Maybe add an example with axis=1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed the ix calls and added axis=1 to my list for later.


>>> s1 = pd.Series(['a', 'b', 'c'])
>>> s2 = pd.Series(['c', 'd', 'e'])
>>> c = pd.concat([s1, s2], keys=["s1", 's2',])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you use consistent quotes within the keys list (eg always single quotes)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've fixed this.

@codecov-io
Copy link

codecov-io commented Jan 3, 2017

Current coverage is 84.75% (diff: 100%)

Merging #15028 into master will decrease coverage by 0.01%

@@             master     #15028   diff @@
==========================================
  Files           145        145          
  Lines         51131      51146    +15   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43345      43350     +5   
- Misses         7786       7796    +10   
  Partials          0          0          

Powered by Codecov. Last update 3662413...c58f9f8

@palewire
Copy link
Contributor Author

palewire commented Jan 3, 2017

@jorisvandenbossche and @jreback, I believe I've made the changes you've asked for. Please let me know if there's anything more I can do.

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some stylistic comments on the examples you added.

by setting the ``ignore_index`` option to ``True``.

>>> s1 = pd.Series(['a', 'b'])
>>> s2 = pd.Series(['c', 'd'])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you leave out the redefinition of s1 and s2 ? (to save some vertical space) We can assume that the example build on each other.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will make this change and push it shortly.

the data with the ``keys`` option.

>>> s1 = pd.Series(['a', 'b', 'c'])
>>> s2 = pd.Series(['c', 'd', 'e'])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here as well (you can use the same s1 and s2 I think)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change coming.

... [s1, s2],
... keys=['s1', 's2',],
... names=['Series name', 'Row ID']
... )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style-wise, can you do

pd.concat([s1, s2], keys=[...],
          names=[...])

to have a more consistent style within the docs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do. I was unaware that was the established style.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, no problem. We don't have this documented clearly what is expected here (and are not always that consistent in existing docs)

>>> df1 = pd.DataFrame(
... [['a', 1], ['b', 2]],
... columns=['letter', 'number']
... )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the same here regarding style

>>> df2 = pd.DataFrame(
... [['c', 3, 'cat'], ['d', 4, 'dog']],
... columns=['letter', 'number', 'animal']
... )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also here, I would leave out the redefinition and display of the same dataframes (then what is shown is more focused on the result)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant this comment actually for the example below, as here df2 is actually changed (maybe only show df2 creation and display in this example)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

1 b 2
>>> df2 = pd.DataFrame(
... [['c', 3], ['d', 4]],
... columns=['letter', 'number']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use here eg the 'animal' column name to have different columns (that is IMO a more natural example to concat columns with different names)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

@palewire
Copy link
Contributor Author

palewire commented Jan 3, 2017

@jorisvandenbossche I believe I have made the changes you've requested. Please let me know if there's more I can do.

@jorisvandenbossche jorisvandenbossche merged commit 7c0b4a3 into pandas-dev:master Jan 4, 2017
@jorisvandenbossche
Copy link
Member

@palewire Thanks!

@jorisvandenbossche jorisvandenbossche added this to the 0.20.0 milestone Jan 4, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants